Preprocess the Data

Set Up Visualization Labels

Create Simple Random Forest Model

Prepare SHAP Data that Explains the Random Forest

Violin Bee-Summary Plot

This is a standard violin plot, but with outliers drawn as individual points. Drawing them as points represents the density of the outliers more accurately than a kernel density estimate computed from so few points. The color represents the average feature value at that position, so red regions correspond mostly to high feature values and blue regions mostly to low feature values.
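To make the coloring rule concrete, here is a minimal sketch of the idea, not the plotting library's actual code. It bins one feature's SHAP values along the horizontal axis and averages the raw feature values in each bin. The data is synthetic, and the helper name `mean_feature_per_shap_bin` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one feature (assumption, not the notebook's data):
# higher feature values push the SHAP value up, plus some noise.
feature_vals = rng.uniform(0, 1500, size=500)
shap_vals = 0.002 * (feature_vals - 750) + rng.normal(scale=0.2, size=500)


def mean_feature_per_shap_bin(shap_vals, feature_vals, n_bins=10):
    """Average the raw feature value inside each SHAP-value bin."""
    edges = np.linspace(shap_vals.min(), shap_vals.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_idx = np.digitize(shap_vals, edges[1:-1])
    means = np.full(n_bins, np.nan)  # NaN marks an empty bin
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            means[b] = feature_vals[mask].mean()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, means


centers, means = mean_feature_per_shap_bin(shap_vals, feature_vals)
for c, m in zip(centers, means):
    # A low mean would be drawn blue, a high mean red.
    print(f"SHAP ~ {c:+.2f}: mean feature value = {m:.1f}")
```

In this synthetic case the bins at the low end of the SHAP axis have low average feature values and the bins at the high end have high ones, which is the blue-to-red gradient the plot would show.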

Changing the label shows which of the features listed on the left have a higher impact on the model's prediction of that label. For example, when predicting the Protocol Name "99TAXI", the feature L7Protocol has little importance, whereas the feature Forward Packet Length does.
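Switching the label amounts to picking a different class slice of the SHAP values. The sketch below assumes the SHAP values are stored as an array of shape (n_samples, n_features, n_classes); some SHAP versions return a list with one array per class instead. The data is synthetic and rigged so that Forward Packet Length dominates for "99TAXI". The other class names, the "Flow Duration" feature, and the helper `rank_features_for_label` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# "99TAXI", "L7Protocol" and "Forward Packet Length" come from the text;
# the remaining names are placeholders for illustration.
class_names = ["99TAXI", "CLASS_B", "CLASS_C"]
feature_names = ["L7Protocol", "Forward Packet Length", "Flow Duration"]

# Synthetic SHAP values, assumed layout: (n_samples, n_features, n_classes).
shap_values = rng.normal(size=(50, 3, 3))
shap_values[:, 1, 0] *= 5.0   # make Forward Packet Length matter for 99TAXI
shap_values[:, 0, 0] *= 0.1   # make L7Protocol nearly irrelevant for 99TAXI


def rank_features_for_label(shap_values, feature_names, class_names, label):
    """Rank features by mean |SHAP| for a single predicted label."""
    class_idx = class_names.index(label)
    per_class = shap_values[:, :, class_idx]        # (n_samples, n_features)
    importance = np.abs(per_class).mean(axis=0)     # one score per feature
    order = np.argsort(importance)[::-1]            # most important first
    return [(feature_names[i], float(importance[i])) for i in order]


ranking = rank_features_for_label(shap_values, feature_names, class_names, "99TAXI")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Passing a different `label` selects a different slice, so the same features can rank very differently from one label to the next.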

Stacked Importance Plot

As in the previous visualization, changing the predicted label shows which features are significant and which are not. However, we can also explore different features and comparisons using the additional labels at the top and to the left of the visualization. When both highlighted features equal a particular value, we can see what values the other features we are comparing against at the top may take.

Bar Plot

This plot shows the proportional impact of each feature, measured as its mean absolute SHAP value. The larger the number, the greater the impact the feature has on the model's categorization.
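The quantity behind each bar can be reproduced by hand. This is a minimal sketch on a small made-up SHAP matrix (4 samples by 3 features), not the notebook's data; "Flow Duration" is a placeholder feature name. Rescaling the scores into shares that sum to 1 is an extra step added here for illustration.

```python
import numpy as np

# Hypothetical SHAP values: rows are samples, columns are features.
feature_names = ["Forward Packet Length", "L7Protocol", "Flow Duration"]
shap_values = np.array([
    [ 0.2, -0.1,  0.0],
    [-0.4,  0.1,  0.0],
    [ 0.6, -0.3,  0.1],
    [-0.2,  0.1, -0.1],
])

# Mean absolute SHAP value per feature: the sign is dropped so that positive
# and negative contributions do not cancel each other out.
mean_abs = np.abs(shap_values).mean(axis=0)

# Optional: rescale so the importances sum to 1 (share of total impact).
share = mean_abs / mean_abs.sum()

# Sort from most to least important, as a bar plot would display them.
order = np.argsort(mean_abs)[::-1]
ranked = [feature_names[i] for i in order]

for i in order:
    print(f"{feature_names[i]}: mean|SHAP| = {mean_abs[i]:.2f}, share = {share[i]:.1%}")
```

Taking the absolute value before averaging matters: a feature that pushes predictions strongly in both directions would otherwise average out to roughly zero and look unimportant.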

Visualize a Single Prediction

Now use the dropdowns at the top to select a feature and see which values of that feature may predict the label above.

Non-XAI Visualization